
Conversation

@MatthewBonanni
Contributor

@MatthewBonanni MatthewBonanni commented Sep 23, 2025

Purpose

Remove the _VLLM_V1 suffixes from attention backend names, as they are no longer needed. If a user includes the suffix in their VLLM_ATTENTION_BACKEND environment variable, it is stripped and a warning is printed.
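
For illustration, a minimal sketch of the intended user-facing behavior (normalize_backend_name and its warning text are hypothetical stand-ins, not the actual selector code):

```python
# Minimal sketch of the suffix-stripping behavior; normalize_backend_name
# is a hypothetical helper for illustration, not vLLM's selector code.
import warnings


def normalize_backend_name(name: str) -> str:
    """Strip the deprecated _VLLM_V1 suffix, warning if it was present."""
    if name.endswith("_VLLM_V1"):
        warnings.warn(
            "The '_VLLM_V1' suffix in VLLM_ATTENTION_BACKEND is no longer "
            "necessary; V0 backends have been deprecated.")
        name = name.removesuffix("_VLLM_V1")
    return name


assert normalize_backend_name("FLASHINFER_VLLM_V1") == "FLASHINFER"
assert normalize_backend_name("FLASH_ATTN") == "FLASH_ATTN"
```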

Test Plan

Test Result


Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request effectively removes the _VLLM_V1 suffixes from attention backend names, which is a great cleanup now that V0 backends are deprecated. The changes are consistent across the codebase, covering test files, configuration, and backend implementations. I appreciate the addition of backward compatibility in vllm/attention/selector.py to handle the old suffixes from environment variables with a warning. I've found one minor area for improvement to simplify a redundant conditional check.

Member

@tlrmchlsmth tlrmchlsmth left a comment


This is a breaking change, right? We should put in a deprecation notice + fallback for at least a release.

Comment on lines +196 to +198
if backend_by_env_var.endswith("_VLLM_V1"):
    # Backward compatibility: warn and strip the deprecated V0-era suffix.
    logger.warning(
        "The suffix '_VLLM_V1' in the environment variable "
        "%s is no longer necessary as V0 backends have been "
        "deprecated. Please remove this suffix from your "
        "environment variable setting.", STR_BACKEND_ENV_VAR)
    backend_by_env_var = backend_by_env_var.removesuffix("_VLLM_V1")
Contributor Author


@tlrmchlsmth It shouldn't be breaking; these lines should prevent issues

Member


perfect

@mgoin mgoin added the ready (ONLY add when PR is ready to merge/full CI is needed) label Sep 23, 2025
@mgoin
Member

mgoin commented Sep 24, 2025

Failures are related, PTAL

MatthewBonanni and others added 2 commits September 24, 2025 15:45
…ction logic)

Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Member

@mgoin mgoin left a comment


LGTM with the test changes, thanks a ton!

@mgoin mgoin enabled auto-merge (squash) September 24, 2025 22:47
@mgoin
Member

mgoin commented Sep 25, 2025

The v1 tests look related too

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
auto-merge was automatically disabled September 25, 2025 13:14

Head branch was pushed to by a user without write access

@MatthewBonanni
Contributor Author

@mgoin Thanks - I removed that test case too (there are no longer any V0 backends to fall back to). Eventually we'll have to remove _is_v1_supported_oracle and that full test_oracle.py file, but that'll be a different PR

@MatthewBonanni
Contributor Author

Created #25673 to remove the oracle

@MatthewBonanni
Contributor Author

@mgoin The V1 Test e2e + engine job is failing for backend FLASHINFER. This test should also be failing on main. It was previously running with backend FLASHINFER_VLLM_V1, which didn't exist, so it would fall back to FlashAttention, which passes (see the recent nightly). Now it correctly runs with FlashInfer, which fails. Not sure what to do about this one.

@ProExpertProg
Collaborator

@MatthewBonanni is it possible to fix the test? Otherwise you can make an issue and disable it

@MatthewBonanni
Contributor Author

MatthewBonanni commented Sep 25, 2025

@ProExpertProg It looks to me like a correctness issue with the FI backend, outside the scope of this PR, so I'll disable it and file an issue. Disabling it shouldn't do much harm, since it was effectively disabled before.

EDIT: The issue is #25679

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
@ProExpertProg ProExpertProg enabled auto-merge (squash) September 25, 2025 17:21
@ProExpertProg ProExpertProg merged commit 3468f17 into vllm-project:main Sep 25, 2025
53 checks passed
@MatthewBonanni MatthewBonanni deleted the remove_suffix branch September 25, 2025 18:03
xuechendi added a commit to vllm-project/vllm-gaudi that referenced this pull request Sep 25, 2025
vllm-project/vllm#25489

Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>
iboiko-habana pushed a commit to iboiko-habana/vllm-gaudi that referenced this pull request Oct 2, 2025
vllm-project/vllm#25489

Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>
Signed-off-by: Iryna Boiko <iboiko@habana.ai>
yewentao256 pushed a commit that referenced this pull request Oct 3, 2025
#25489)

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
vllm-project#25489)

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
choprahetarth pushed a commit to Tandemn-Labs/vllm that referenced this pull request Oct 11, 2025
vllm-project#25489)

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
vllm-project#25489)

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
vllm-project#25489)

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni001@gmail.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>

Labels

ci/build, kv-connector, ready (ONLY add when PR is ready to merge/full CI is needed), rocm (Related to AMD ROCm), speculative-decoding, tpu (Related to Google TPUs), v1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants